Search results for "Pattern discovery"

showing 10 items of 10 documents

Sequential Mining Classification

2017

Sequential pattern mining is a data mining technique that aims to extract and analyze frequent subsequences from sequences of events or items under time constraints. Sequence data mining was introduced in 1995 with the well-known Apriori algorithm. The algorithm studied transactions over time in order to extract frequent patterns from the sequences of products associated with a customer. Later, the technique proved useful in many applications: DNA research, medical diagnosis and prevention, telecommunications, etc. GSP, SPAM, SPADE, PrefixSpan and other advanced algorithms followed. In view of the evolution of data mining techniques based on sequential data, this paper discusses the multiple …

Apriori algorithm; Computer science; business.industry; Data stream mining; Concept mining; 02 engineering and technology; computer.software_genre; Machine learning; GSP Algorithm; Tree (data structure); Statistical classification; ComputingMethodologies_PATTERNRECOGNITION; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Data mining; Artificial intelligence; business; K-optimal pattern discovery; computer; FSA-Red Algorithm; 2017 International Conference on Computer and Applications (ICCA)
researchProduct
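
As a rough illustration of what "frequent subsequences" means in the abstract above, the following minimal sketch counts how many sequences in a small database contain a candidate pattern in order, with gaps allowed, and enumerates candidates by brute force. It is not Apriori, GSP, SPADE or PrefixSpan, and it ignores time constraints; the function names and the toy database are assumptions made for illustration only.

```python
from itertools import product


def is_subsequence(pattern, sequence):
    """Return True if pattern occurs in sequence in order, gaps allowed."""
    it = iter(sequence)
    # 'item in it' advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_patterns(database, length, min_support):
    """Brute-force enumeration of frequent patterns of a fixed length."""
    items = sorted({item for seq in database for item in seq})
    result = {}
    for candidate in product(items, repeat=length):
        count = support(candidate, database)
        if count >= min_support:
            result[candidate] = count
    return result


if __name__ == "__main__":
    # Hypothetical purchase sequences, one per customer.
    db = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"]]
    print(frequent_patterns(db, length=2, min_support=2))
```

The brute-force enumeration grows exponentially with pattern length; the algorithms named in the abstract exist largely to prune that candidate space.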

Discovering representative models in large time series databases

2004

The discovery of frequently occurring patterns in a time series can be important in several application contexts. For example, the analysis of frequent patterns in biomedical observations could support diagnosis and/or prognosis. Moreover, the efficient discovery of frequent patterns may play an important role in several data mining tasks such as association rule discovery, clustering and classification. However, in order to identify interesting repetitions, it is necessary to allow errors in the matching patterns; in this context, it is difficult to select one pattern particularly suited to represent the set of similar ones, whereas modelling this set with a single model could…

Association rule learning; Discretization; Computer science; Context (language use); Correlation and dependence; computer.software_genre; Set (abstract data type); Cardinality; Knowledge extraction; Motif extraction Pattern discovery; Pattern matching; Data mining; Cluster analysis; Time complexity; computer
researchProduct

Pattern Discovery In Biosequences: From Simple To Complex Patterns

2007

Bioinformatics Pattern Discovery String Analysis
researchProduct

Textual data compression in computational biology: Algorithmic techniques

2012

A recent review [R. Giancarlo, D. Scaturro, F. Utro, Textual data compression in computational biology: a synopsis, Bioinformatics 25 (2009) 1575–1586] gave the first systematic organization and presentation of the impact of textual data compression on the analysis of biological data. Its main focus was on a systematic presentation of the key areas of bioinformatics and computational biology where compression has been used, together with a technical presentation of how well-known notions from information theory have been adapted to work successfully on biological data. Rather surprisingly, the use of data compression is pervasive in computational biology. Starting from…

Biological data; Data Compression Theory and Practice Alignment-free sequence comparison Entropy Huffman coding Hidden Markov Models Kolmogorov complexity Lempel–Ziv compressors Minimum Description Length principle Pattern discovery in bioinformatics Reverse engineering of biological networks Sequence alignment; Settore INF/01 - Informatica; General Computer Science; Kolmogorov complexity; Computer science; Search engine indexing; Computational biology; Information theory; Information science; Theoretical Computer Science; Technical Presentation; Entropy (information theory); Data compression; Computer Science Review
researchProduct

Characterization and Extraction of Irredundant Tandem Motifs

2012

We address the problem of extracting pairs of subwords (m1, m2) from a text string s of length n such that, given an integer constant d as additional input, m1 and m2 occur in tandem within a maximum distance of d symbols in s. The main effort of this work is to eliminate possible redundancy from the candidate set of tandem motifs found in this way. To this end, we first introduce the concept of maximality, characterized by four specific conditions, which we show cannot be deduced from the corresponding notion of maximality already defined for "simple" (i.e., non-tandem) motifs. Then, we further eliminate the remaining redundancy by defining the concept of irredundancy for tandem motifs. We prove t…

Discrete mathematics; Redundancy (information theory); Tandem; Motif extraction Pattern discovery; Text string; Linear number; Mathematics
researchProduct
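
To make the tandem setting in the abstract above concrete, here is a minimal sketch that lists the positions where two given subwords occur in tandem. It assumes that "distance" means the number of symbols between the end of m1 and the start of m2 (0 to d inclusive), which the truncated abstract does not state. It only checks a given pair and does not implement the paper's maximality or irredundancy conditions; all names are hypothetical.

```python
def find_all(text, word):
    """Start positions of all (possibly overlapping) occurrences of word."""
    positions = []
    start = text.find(word)
    while start != -1:
        positions.append(start)
        start = text.find(word, start + 1)
    return positions


def tandem_occurrences(text, m1, m2, d):
    """Pairs (i, j) where m1 starts at i, m2 starts at j, and the gap
    between the end of m1 and the start of m2 is between 0 and d.

    The gap definition is an assumption made for this sketch.
    """
    starts2 = find_all(text, m2)
    pairs = []
    for i in find_all(text, m1):
        end1 = i + len(m1)
        for j in starts2:
            if 0 <= j - end1 <= d:
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    print(tandem_occurrences("abxxcdabcd", "ab", "cd", d=2))
```

Enumerating every candidate pair this way is quadratic in the number of occurrences, which is one reason a compact, non-redundant set of tandem motifs is attractive.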

Motif patterns in 2D

2008

Motif patterns consisting of sequences of intermixed solid and don't-care characters have been introduced and studied in connection with pattern discovery problems of computational biology and other domains. In order to alleviate the exponential growth of such motifs, notions of maximal saturation and irredundancy have been formulated, whereby more or less compact subsets of the set of all motifs can be extracted that are capable of expressing all others by suitable combinations. In this paper, we introduce the notion of maximal irredundant motifs in a two-dimensional array and develop initial properties and a combinatorial argument that poses a linear bound on the total number of …

General Computer Science; Pattern discovery; Theoretical Computer Science; Combinatorics; Exponential growth; Motif extraction Pattern discovery 2D Motifs; Motif; 2D irredundant motifs; Motif (music); Pattern matching; Remainder; Design and analysis of algorithms; Mathematics; Computer Science(all); Theoretical Computer Science
researchProduct
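
The notion of a motif made of solid and don't-care characters, as described in the abstract above, can be illustrated in the simpler one-dimensional case. The sketch below assumes "." as the don't-care symbol and reports every position where a motif matches a text; it does not compute maximal or irredundant motifs, and the names are hypothetical.

```python
DONT_CARE = "."  # assumed wildcard symbol, matches any single character


def matches_at(motif, text, pos):
    """True if motif matches text starting at pos (solid characters must agree)."""
    if pos < 0 or pos + len(motif) > len(text):
        return False
    return all(
        m == DONT_CARE or m == text[pos + k]
        for k, m in enumerate(motif)
    )


def occurrences(motif, text):
    """All start positions at which the motif matches the text."""
    return [
        pos
        for pos in range(len(text) - len(motif) + 1)
        if matches_at(motif, text, pos)
    ]


if __name__ == "__main__":
    print(occurrences("a.c", "abcaxcac"))
```

Because each don't-care position can stand for any character, the number of distinct motifs that occur in a text grows quickly, which is the growth that the maximality and irredundancy notions are meant to tame.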

Flexible pattern discovery with (extended) disjunctive logic programming

2005

The post-genomic era has raised a wide range of challenging new issues for the areas of knowledge discovery and intelligent information management. Among them, the discovery of complex pattern repetitions in string databases plays an important role, specifically in contexts where even the pattern classes that should be considered interesting are unknown. This paper contributes to this setting by proposing a novel approach, based on disjunctive logic programming extended with several advanced features, for discovering interesting pattern classes from a given data set.

Information management; Range (mathematics); Knowledge extraction; business.industry; Computer science; Logical programming; Disjunctive programming; Information system; Motif extraction Pattern discovery; Artificial intelligence; Levenshtein distance; business; K-optimal pattern discovery
researchProduct

Derivazione Efficiente di Pattern Strutturati Frequenti da Database di Natura Biologica [Efficient Derivation of Frequent Structured Patterns from Biological Databases]

2004

Motif extraction Pattern discovery
researchProduct

Optimal extraction of motif patterns in 2D

2009

The combinatorial explosion of motif patterns occurring in 1D and 2D arrays leads to the consideration of special classes of motifs growing linearly with the size of the input array. Such motifs, called irredundant motifs, are able to succinctly represent all of the other motifs occurring in the same array within reasonable time and space bounds. In previous work irredundant motifs were extracted from 2D arrays in O(N^2 log^2 n log log n) and O(N^3) time, where N is the size of the 2D input array and n is its largest dimension. In this paper, we present an algorithm to extract irredundant motifs from 2D arrays that is quadratic in the size of the input. The input is defined on a binary al…

Motif extraction Pattern discovery
researchProduct

Image Compression by 2D Motif Basis

2011

Approaches to image compression and indexing based on extensions to 2D of some of the Lempel-Ziv incremental parsing techniques have been proposed in the recent past. In these approaches, an image is decomposed into a number of patches, each consisting of a square or rectangular solid block. This paper proposes image compression techniques based on patches that are not necessarily solid blocks, but are instead affected by a controlled number of undetermined or don't-care pixels. Such patches are chosen from a set of candidate motifs that are extracted in turn from the image 2D motif basis, the latter consisting of a compact set of patterns that result from the autocorrelation of the image w…

Pixel; business.industry; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Pattern recognition; Data_CODINGANDINFORMATIONTHEORY; computer.file_format; JPEG; Image (mathematics); Compression (functional analysis); Motif extraction Pattern discovery; Artificial intelligence; business; Algorithm; computer; Image compression; Data compression; Mathematics; Color Cell Compression; Block (data storage); 2011 Data Compression Conference
researchProduct